# DG2CEP: A Density-Grid Stream Clustering Platform

Hyuga is an implementation of the DG2CEP on-line stream-processing algorithm for the ESPER CEP engine, written with its EPL rules.

## Basics

The concentration (cluster) of mobile entities in a certain region, _e.g._, a mass street protest, a rock concert, or a traffic jam, is information that can benefit several distributed applications. Nevertheless, cluster detection in _on-line scenarios_ is a challenging task, primarily because it requires efficient and complex algorithms to handle the high volume of position data in a timely manner.

To address this issue, we proposed **DG2CEP**, an on-line algorithm for detecting such clusters, inspired by data mining algorithms and based on stream-oriented Complex Event Processing (CEP) concepts. Our experiments indicate that DG2CEP can detect cluster formation and dispersion rapidly, in less than a few seconds. In addition, the time required to detect such clusters scales linearly with the number of nodes. Finally, regarding accuracy, several experiments show that the clusters detected by DG2CEP have a very high degree of similarity to those found by the classic DBSCAN density-based clustering algorithm.

The main idea of DG2CEP is to simplify the clustering process by first mapping the position data to CEP context partitions, and then clustering the partitions rather than the nodes (using a DBSCAN-like expansion). However, this expansion only occurs if the given context partition has at least the minimum number of nodes, _minPts_, mapped to it (as with DBSCAN core points). Further, since context partitions are adjacent and follow a grid-like scheme, their clustering expansion is trivial (_i.e._, it proceeds to the adjacent cells). The overall processing flow is illustrated below:

![overview3.png](http://download-codeplex.sec.s-msft.com/Download?ProjectName=dg2cep&DownloadId=1459742 "overview3.png")
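
The grid idea can also be sketched in plain Java. This is only an illustration under stated assumptions, not Hyuga's code (Hyuga expresses the logic as EPL rules and context partitions): the class `GridClusterSketch`, its method names, the `cellSize` parameter, and the choice of the 8 surrounding cells as "adjacent" are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only: hypothetical names, not part of Hyuga. */
final class GridClusterSketch {

    /** Packs a (row, col) cell index into a single long key. */
    static long cellKey(int row, int col) {
        return ((long) row << 32) | (col & 0xFFFFFFFFL);
    }

    /** Maps a position to its grid cell; cellSize uses the same unit as the coordinates. */
    static long cellOf(double lat, double lng, double minLat, double minLng, double cellSize) {
        int row = (int) Math.floor((lat - minLat) / cellSize);
        int col = (int) Math.floor((lng - minLng) / cellSize);
        return cellKey(row, col);
    }

    /** Counts how many positions ({lat, lng} pairs) fall into each cell. */
    static Map<Long, Integer> countPerCell(double[][] positions,
                                           double minLat, double minLng, double cellSize) {
        Map<Long, Integer> counts = new HashMap<Long, Integer>();
        for (double[] p : positions) {
            long key = cellOf(p[0], p[1], minLat, minLng, cellSize);
            Integer old = counts.get(key);
            counts.put(key, old == null ? 1 : old + 1);
        }
        return counts;
    }

    /** Groups dense cells (count >= minPts) that are adjacent to each other. */
    static List<Set<Long>> cluster(Map<Long, Integer> counts, int minPts) {
        List<Set<Long>> clusters = new ArrayList<Set<Long>>();
        Set<Long> visited = new HashSet<Long>();
        for (Map.Entry<Long, Integer> e : counts.entrySet()) {
            long seed = e.getKey();
            // Only unvisited dense cells may start an expansion.
            if (e.getValue() < minPts || !visited.add(seed)) {
                continue;
            }
            Set<Long> cluster = new HashSet<Long>();
            Deque<Long> frontier = new ArrayDeque<Long>();
            frontier.add(seed);
            while (!frontier.isEmpty()) {
                long cell = frontier.poll();
                cluster.add(cell);
                int row = (int) (cell >> 32);
                int col = (int) cell;
                // Assumption: "adjacent" means the 8 surrounding cells.
                for (int dr = -1; dr <= 1; dr++) {
                    for (int dc = -1; dc <= 1; dc++) {
                        long neighbor = cellKey(row + dr, col + dc);
                        Integer c = counts.get(neighbor);
                        if (c != null && c >= minPts && visited.add(neighbor)) {
                            frontier.add(neighbor);
                        }
                    }
                }
            }
            clusters.add(cluster);
        }
        return clusters;
    }
}
```

Because the expansion walks cells instead of nodes, its cost depends on the number of occupied cells rather than on pairwise distances between positions, which is the saving the grid mapping is meant to provide.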

## Dependencies
*   [Java Runtime 1.6 or higher](http://java.sun.com/);
*   [Apache Maven 3.0 or higher](https://maven.apache.org/);
*   [Context Net SDDL Communication Middleware](http://www.lac.inf.puc-rio.br/software/contextnetsddl-scalable-data-distribution-layer);
*   [ESPER CEP Engine](http://www.espertech.com) (Automatically resolved by Maven POM).

## Parameters
Hyuga (a DG2CEP ESPER implementation) uses a configuration file (following the Java properties format) for its parameters. Some parameters, such as _eps_ and _minPts_, are directly tied to DBSCAN and the clustering semantics desired, while the latitude and longitude ranges delimit the monitored domain. The time window size _win_ is the period (in seconds) during which the positions of the mobile nodes are considered. Finally, developers can also configure the communication middleware (for distributed deployment): the desired DDS implementation, the input and output topics, and the EPAs to deploy.

For example, the following configuration file assumes a DBSCAN semantic of _minPts_ 10 and a 50x50 grid (_eps_). The context partition grid is placed in the monitored region delimited by the latitude interval (39.717173, 40.211130) and the longitude interval (116.047211, 116.870327). Events within a time window (_win_) of 300 seconds are considered. It uses the OpenSplice DDS implementation, subscribes to LocationUpdateEvent, and outputs CellClusterEvents. Finally, it deploys all EPAs on this machine.


```
#!bash
###################################
# Hyuga Parameters
###################################

### DBSCAN variables###############
eps    = 50
minPts = 10

# Latitude
minlat = 39.717173
maxlat = 40.211130

# Longitude
minlng = 116.047211
maxlng = 116.870327
        
# Window Period (in sec)
win = 300

### DDS parameter #################
dds         = OpenSplice
distributed = true
publish     = Hyuga
subscribe   = LocationUpdateEvent

### EPAs in this machine ##########
deploy      = ALL
output      = NONE

### Internal Arguments ############
useLowerPts = false
debug       = true

```
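
As a rough illustration of how such a file can be consumed, the sketch below reads the clustering parameters with the standard `java.util.Properties` class. The class `HyugaConfigSketch`, its fields, and its validation rules are hypothetical and are not Hyuga's actual loader; only the property keys are taken from the file above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

/** Illustrative sketch only: hypothetical class, not Hyuga's actual loader. */
final class HyugaConfigSketch {
    final double eps;
    final int minPts;
    final double minLat, maxLat, minLng, maxLng;
    final long winSeconds;

    private HyugaConfigSketch(Properties p) {
        eps = Double.parseDouble(required(p, "eps"));
        minPts = Integer.parseInt(required(p, "minPts"));
        minLat = Double.parseDouble(required(p, "minlat"));
        maxLat = Double.parseDouble(required(p, "maxlat"));
        minLng = Double.parseDouble(required(p, "minlng"));
        maxLng = Double.parseDouble(required(p, "maxlng"));
        winSeconds = Long.parseLong(required(p, "win"));
        // Assumed sanity checks; the real constraints may differ.
        if (minLat >= maxLat || minLng >= maxLng) {
            throw new IllegalArgumentException("empty latitude/longitude range");
        }
        if (minPts < 1 || winSeconds < 1) {
            throw new IllegalArgumentException("minPts and win must be positive");
        }
    }

    /** Parses configuration text in Java properties format. */
    static HyugaConfigSketch parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalArgumentException("unreadable configuration", e);
        }
        return new HyugaConfigSketch(p);
    }

    /** Returns the trimmed value of a mandatory key or fails with a clear message. */
    private static String required(Properties p, String key) {
        String value = p.getProperty(key);
        if (value == null) {
            throw new IllegalArgumentException("missing property: " + key);
        }
        return value.trim();
    }
}
```

Lines starting with `#` are treated as comments by `Properties.load`, so the banner lines in the file above are ignored. To read from disk instead of a string, a `java.io.FileReader` can be passed to `Properties.load` in the same way.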

## Execution
Hyuga is built and run through Apache Maven. To build and run the system, execute:  

`$ mvn package`  
`$ mvn run`

## Authors

*   [PhD Candidate Marcos Roriz Junior](http://www.inf.puc-rio.br/~mroriz)
*   [Prof. Dr. Francisco José da Silva e Silva](http://www.deinf.ufma.br/~fssilva/)
*   [Prof. Dr. rer. nat. Markus Endler](http://www-di.inf.puc-rio.br/~endler/)

## License
Hyuga: A Density-Grid Stream Clustering Platform.

Copyright (C) 2014 PUC-Rio/Laboratory for Advanced Collaboration

Hyuga is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

Hyuga is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with Hyuga.  If not, see <http://www.gnu.org/licenses/>.
